CP2K - Sparse Linear Algebra on 1000s of cores A dCSE Project
Abstract
CP2K is a freely available atomistic and molecular simulation code, able to study a wide range of molecular and bulk materials with methods including classical potentials, density functional theory (DFT), Hartree-Fock and post-HF methods. Following two earlier dCSE projects, we report here on an additional 6 months of work to optimise the DBCSR sparse matrix multiplication library embedded within CP2K. Efficient and scalable sparse matrix operations are shown to benefit existing users of the code by reducing time to solution for typical simulations, and have enabled the development of new algorithms, including fully linear-scaling DFT based on density matrix iterations.
Related resources
Improving the scalability of CP2K on multi-core systems A dCSE Project
Six months of HECToR dCSE funding was given to implement mixed-mode OpenMP parallelism in CP2K, building on the results of an earlier successful dCSE project. Improved scalability of up to 8 times as many cores was demonstrated for a small benchmark, and a larger, inhomogeneous benchmark was shown to scale up to 9000+ cores. An increase in peak performance of up to 60% was also realised on HECT...
Improving the performance of CP2K on HECToR A dCSE Project
This report presents the results of a HECToR dCSE project to improve the performance of CP2K, a freely available and popular Density Functional Theory code, on HECToR. Building on a recently implemented domain decomposition method, further optimisation of the code was performed, and significant performance gains were measured: around 30% on 256 cores (for a generally representative benchmark) an...
Improving Communication Performance of Sparse Linear Algebra for an Atomistic Simulation Application
Modern parallel machines are powerful, but have complex architectures and hierarchical topologies. As a result, the communication overheads associated with hardware asymmetry and the interconnection network increase. To achieve scalable performance on these machines, it is essential to reduce the communication costs of parallel applications such as CP2K. From computati...
dCSE Fluidity-ICOM: High Performance Computing Driven Software Development for Next-Generation Modelling of the World's Oceans
During the course of this project dCSE Fluidity-ICOM has been transformed from a code that was primarily used on institution level clusters with typically 64 tasks used per simulation into a highly performing scalable code which can be run efficiently on 4096 cores of the current HECToR hardware (Cray XT4 Phase2a). Fluidity-ICOM has been parallelised with MPI and optimised for HECToR alongside ...
Porting of the DBCSR Library for Sparse Matrix-Matrix Multiplications to Intel Xeon Phi Systems
Multiplication of two sparse matrices is a key operation in the simulation of the electronic structure of systems containing thousands of atoms and electrons. The highly optimized sparse linear algebra library DBCSR (Distributed Block Compressed Sparse Row) has been specifically designed to efficiently perform such sparse matrix-matrix multiplications. This library is the basic building block f...